{"id":36842,"date":"2020-12-11T11:00:00","date_gmt":"2020-12-11T11:00:00","guid":{"rendered":"https:\/\/www.vmengine.net\/2020\/12\/11\/aws-server-issues-in-northern-virginia-what-was-the-problem\/"},"modified":"2025-05-23T17:24:31","modified_gmt":"2025-05-23T17:24:31","slug":"aws-server-issues-in-northern-virginia-what-was-the-problem","status":"publish","type":"post","link":"http:\/\/temp_new.vmenginelab.com\/en\/2020\/12\/11\/aws-server-issues-in-northern-virginia-what-was-the-problem\/","title":{"rendered":"AWS Server Issues in Northern Virginia: What Was the Problem?"},"content":{"rendered":"<div class=\"et_pb_section et_pb_section_209 et_section_regular\" >\n<div class=\"et_pb_row et_pb_row_299\">\n<div class=\"et_pb_column et_pb_column_4_4 et_pb_column_298  et_pb_css_mix_blend_mode_passthrough et-last-child\">\n<div class=\"et_pb_module et_pb_text et_pb_text_589  et_pb_text_align_left et_pb_bg_layout_light\">\n<div class=\"et_pb_text_inner\">\n<h5><em><span style=\"font-weight: 400;\">It was Wednesday, November 25, a day like any other in Northern Virginia, in the southeastern United States, when Amazon Web Services suffered an outage that caused significant problems for many online services.<\/span><\/em><\/h5>\n<\/div><\/div>\n<div class=\"et_pb_module et_pb_text et_pb_text_590  et_pb_text_align_left et_pb_bg_layout_light\">\n<div class=\"et_pb_text_inner\">\n<p><span style=\"font-weight: 400;\">After analyzing the issue, the company&#8217;s Seattle headquarters said that the outage occurred only in the Northern Virginia region, following a &#8220;small addition of capacity&#8221; to its front-end fleet of Kinesis servers.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is no small inconvenience, given that Amazon Kinesis, an AWS service for processing streaming data in real time, is used not only directly by customers but also by large companies such as Adobe Spark, Roku, Flickr or 
Autodesk. This means that almost all of the major cloud-based software apps that rely on Amazon Kinesis for their back end were affected by the disruption.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The problems also affected cryptocurrency portals, which failed to process transactions, and streaming and podcast services, which limited users&#8217; access to their accounts. Among the services with issues reported on DownDetector were Ring, Prime Music, Pokemon Go, Roku, MeetUp.com, League of Legends, Ancestry.com, Chime, and more.<\/span><\/p>\n<\/div><\/div>\n<div class=\"et_pb_module et_pb_image et_pb_image_243 et_animated et-waypoint\">\n<p>\t\t\t\t<a href=\"https:\/\/aws.amazon.com\/it\/kinesis\/\" target=\"_blank\"><span class=\"et_pb_image_wrap \"><img fetchpriority=\"high\" decoding=\"async\" width=\"595\" height=\"321\" src=\"https:\/\/temp_new.vmenginelab.com\/wp-content\/uploads\/2020\/12\/kinesis-min.png\" alt=\"\" title=\"\"  sizes=\"(max-width: 595px) 100vw, 595px\" class=\"wp-image-32410\" srcset=\"http:\/\/temp_new.vmenginelab.com\/wp-content\/uploads\/2020\/12\/kinesis-min.png 595w, http:\/\/temp_new.vmenginelab.com\/wp-content\/uploads\/2020\/12\/kinesis-min-300x162.png 300w\" \/><\/span><\/a>\n\t\t\t<\/div>\n<div class=\"et_pb_module et_pb_text et_pb_text_591  et_pb_text_align_left et_pb_bg_layout_light\">\n<div class=\"et_pb_text_inner\">\n<p><span style=\"font-weight: 400;\">According to the cloud giant, <\/span><b>the outage happened after a &#8220;small addition of capacity&#8221; to its front-end fleet of Kinesis servers. 
<\/b> <\/p>\n<p><span style=\"font-weight: 400;\">&#8220;<\/span><i><br \/>\n  <span style=\"font-weight: 400;\">The trigger, although not the root cause of the event, <\/span><br \/>\n<\/i><span style=\"font-weight: 400;\">&#8211; the company is keen to point out &#8211; <\/span> <i><span style=\"font-weight: 400;\">was a relatively small addition of capacity that began to be added to the service at 2:44 a.m. PST and finished at 3:47 a.m. PST. Kinesis has a large number of back-end cell clusters that process streams. These are Kinesis&#8217; workhorses, providing distribution, access, and scalability for stream processing. Streams are spread across the back end via a sharding mechanism owned by a fleet of front-end servers. A back-end cluster owns many shards and provides a consistent scaling unit and fault isolation. The front end&#8217;s job is small but important. It handles authentication, throttling, and routing of requests to the correct stream shards on the back-end clusters<\/span><\/i><span style=\"font-weight: 400;\">\u201d.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">&#8220;<\/span><i><br \/>\n  <span style=\"font-weight: 400;\">At 9:39 a.m. PST, <\/span><br \/>\n<\/i><span style=\"font-weight: 400;\">&#8211; they continue &#8211; <\/span> <i><span style=\"font-weight: 400;\">we were able to confirm that the root cause was not memory pressure. Rather, the new capacity had caused all of the servers in the fleet to exceed the maximum number of threads allowed by an operating system configuration. 
When this limit was exceeded, cache construction failed to complete and front-end servers were left with useless shard maps that made them unable to route requests to back-end clusters<\/span><\/i><span style=\"font-weight: 400;\">\u201d.<\/span><\/p>\n<\/div><\/div>\n<div class=\"et_pb_module et_pb_text et_pb_text_592  et_pb_text_align_left et_pb_bg_layout_light\">\n<div class=\"et_pb_text_inner\">\n<p><span style=\"font-weight: 400;\">In short, the problem was triggered by <strong>an attempt to increase the capacity of the system<\/strong>. Adding new servers to Amazon&#8217;s dominant cloud computing network set off a series of cascading errors that caused problems for several online services.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Acknowledging one&#8217;s mistakes is essential, however, and in this case the cloud giant was quick to apologize to its customers. &#8220;We will do everything we can to learn from this event and use it to improve further,&#8221; they said.<\/span><\/p>\n<\/div><\/div>\n<div class=\"et_pb_module et_pb_cta_99 et_animated et_pb_promo  et_pb_text_align_center et_pb_bg_layout_dark\">\n<div class=\"et_pb_promo_description\">\n<h3 class=\"et_pb_module_header\">Want to find out how to prevent this from happening to your business?  <\/h3>\n<\/div>\n<div class=\"et_pb_button_wrapper\"><a class=\"et_pb_button et_pb_promo_button\" href=\"https:\/\/temp_new.vmenginelab.com\/en\/contacts\/\" target=\"_blank\">Talk to one of our consultants now<\/a><\/div>\n<\/p><\/div>\n<\/p><\/div>\n<\/p><\/div>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>It was Wednesday, November 25, a day like any other in Northern Virginia, in the southeastern United States, when Amazon Web Services suffered an outage that caused significant problems for many online services. 
After analyzing the issue, the company&#8217;s Seattle headquarters said that the outage occurred only in the Northern Virginia [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":32417,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[97,2297],"tags":[4289,4290,3596,4291,4292,4293,4294,4295],"class_list":["post-36842","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog-en","category-news-en","tag-avoiding-aws-downs","tag-aws-availability-zones","tag-aws-cloud-support","tag-aws-devops-kinesis","tag-aws-region-down-en","tag-aws-region-virginia-en","tag-down-aws-en","tag-kinesis-aws-how-to-en"],"aioseo_notices":[],"jetpack_featured_media_url":"http:\/\/temp_new.vmenginelab.com\/wp-content\/uploads\/2020\/12\/thumbnail-1.jpg","amp_enabled":true,"_links":{"self":[{"href":"http:\/\/temp_new.vmenginelab.com\/en\/wp-json\/wp\/v2\/posts\/36842","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/temp_new.vmenginelab.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/temp_new.vmenginelab.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/temp_new.vmenginelab.com\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"http:\/\/temp_new.vmenginelab.com\/en\/wp-json\/wp\/v2\/comments?post=36842"}],"version-history":[{"count":1,"href":"http:\/\/temp_new.vmenginelab.com\/en\/wp-json\/wp\/v2\/posts\/36842\/revisions"}],"predecessor-version":[{"id":41525,"href":"http:\/\/temp_new.vmenginelab.com\/en\/wp-json\/wp\/v2\/posts\/36842\/revisions\/41525"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/temp_new.vmenginelab.com\/en\/wp-json\/wp\/v2\/media\/32417"}],"wp:attachment":[{"href":"http:\/\/temp_new.vmenginelab.com\/en\/wp-json\/wp\/v2\/media?parent=36842"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/temp_ne
w.vmenginelab.com\/en\/wp-json\/wp\/v2\/categories?post=36842"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/temp_new.vmenginelab.com\/en\/wp-json\/wp\/v2\/tags?post=36842"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}